FIG. 2



GENERATE ENTROPY 
MEASURES. 
200 



FOR EACH 
RELATION: 
202 



SELECT ATTRIBUTES,
RANDOMLY OR BY
FREQUENCY OF USE OR
OTHERWISE.

204 



SELECT UNIFORM 
SAMPLE OF N TUPLES IN 
SELECTED ATTRIBUTES. 
206 
FOR EACH SELECTED
ATTRIBUTE: 
208 



INITIALIZE HASH 
TABLE. 

210 



INSERT VALUE INTO 
HASH TABLE, 
FREQUENCY 1 
218 



INCREMENT NUMBER
OF DISTINCT VALUES
D.
220



FOR EACH VALUE IN 
SELECTED TUPLES: 
212 



NEW
VALUE
HASH VALUE.
214
VALUE EXISTS



INCREMENT 
FREQUENCY OF 
VALUE 
216 



NEXT. 
222 



MORE



DONE 



COMPUTE VALUE 
PROBABILITIES P(l) FROM 
VALUE FREQUENCY/D. 
224 



COMPUTE ATTRIBUTE 
SKEW FROM P(l) AND
D (EQ. 1).

226



MORE



NEXT. 
DONE



NEXT. 
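
The hash-table loop of steps 208 through 226 can be sketched as follows. This is only an illustrative sketch under stated assumptions: EQ. 1 is not reproduced on this sheet, so the skew below is assumed to be one minus the Shannon entropy divided by log2(D), and probabilities are taken as frequency over sample size so that they sum to one. The function names are hypothetical.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Return (D, entropy in bits) for the sampled values of one attribute."""
    freq = Counter()  # hash table: value -> frequency (step 210)
    distinct = 0      # D, the number of distinct values
    for v in values:  # step 212
        if v not in freq:     # step 214: new value
            freq[v] = 1       # step 218
            distinct += 1     # step 220
        else:                 # step 214: value exists
            freq[v] += 1      # step 216
    n = len(values)
    # Step 224 (assumption: probability = frequency / sample size).
    entropy = -sum((c / n) * math.log2(c / n) for c in freq.values())
    return distinct, entropy


def normalized_skew(values):
    """Assumed stand-in for EQ. 1: 0.0 for a uniform sample, near 1.0 when skewed."""
    d, h = attribute_entropy(values)
    if d <= 1:
        return 0.0  # convention chosen here for empty or single-valued samples
    return 1.0 - h / math.log2(d)
```

A uniform sample such as `["a", "b", "c", "d"]` yields a skew of 0.0 under this assumed formula, while a sample dominated by one value yields a skew between 0 and 1.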
FOR EACH
RELATION: 
300 



SELECT UNIFORM RANDOM 
SAMPLE OF TUPLES OF 
ALL N ATTRIBUTES OF 
RELATION. 
302 



SELECT RANDOM SAMPLE OF 
X*N (INTEGER X) OF THE 
N*(N-1)/2 2-DIMENSIONAL
SUBSPACES OF N ATTRIBUTES. 
304 



FOR EACH 2-D 
SUBSPACE: 
306 



COMPUTE INFORMATION 
GAIN (EQ. 7) OF SUBSPACE,
CONDITIONALLY ADD TO 
SET M(2). 
308 



STORE INDEXES FOR 
SUBSPACES IN ALL 
SETS M(l). 
316 



MORE



NEXT. 
DONE


DONE. 




SELECT ALL l-DIMENSIONAL 
SUBSPACES THAT ARE 
SUPERSPACES OF 
SUBSPACES IN SET M(l-1).
318 



COMPUTE MUTUAL 
INFORMATION GAIN (EQ. 8) OF 
ALL SELECTED l-DIMENSIONAL
SUBSPACES, CONDITIONALLY
ADD TO SET M(l).
320 



DELETE FROM M(l-1), ALL
SUBSPACES HAVING
l-DIMENSIONAL
SUPERSPACES IN SET M(l).
INCREMENT l.
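
The 2-dimensional pass of steps 304 through 308 might look like the sketch below. EQ. 7 is not reproduced on this sheet, so the standard mutual information H(X) + H(Y) - H(X, Y) is assumed in its place. The threshold parameter and function names are hypothetical. For brevity the sketch scans every attribute pair rather than a random sample of X*N pairs, and it omits the extension to higher dimensions.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(items):
    """Shannon entropy in bits of a list of hashable items."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def mutual_information(rows, i, j):
    """Assumed stand-in for EQ. 7: H(X) + H(Y) - H(X, Y) for columns i and j."""
    xs = [r[i] for r in rows]
    ys = [r[j] for r in rows]
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def correlated_pairs(rows, threshold):
    """Return the set M(2): attribute pairs whose gain exceeds the threshold."""
    n_attrs = len(rows[0])
    return {
        (i, j)
        for i, j in combinations(range(n_attrs), 2)
        if mutual_information(rows, i, j) > threshold
    }
```

With sampled tuples in which the first two attributes always agree and the third varies independently, only the pair (0, 1) would be added to M(2).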
GENERATE STATISTICS
FOR SELECTION
CRITERIA(ON) IN A
RELATION. 
ANY INDEX FOR
ATTRIBUTE(S) IN 
CRITERIA(ON)? 
402 



VALIDATE STATISTICS 
FOR EACH SUBSPACE 
USING INDEX (FIG. 5). 
418 



No 



IDENTIFY SUBSPACE 
COMBINATION. 
406 



Yes



INDEXES 
AVAILABLE FOR 
ALL SUBSPACES? 
408 



No 



Yes 



ANY SUBSPACE 
COMBINATIONS
REMAINING TO CHECK?
410 



COMPUTE STATISTIC FOR 
SELECTION CRITERIA(ON) 
BY ASSUMING 
INDEPENDENCE. 
420 



No 



SELECT SUBSPACE 
COMBINATION HAVING 
CLOSEST AVAILABLE 
INDEXES.
412 



VALIDATE 
STATISTICS USING 
INDEX (FIG. 5). 
404 
VALIDATE STATISTICS
FOR EACH SUBSPACE 
USING INDEX (FIG. 5). 
414 



COMPUTE STATISTIC FOR 
SELECTION CRITERIA(ON) BY 
ESTIMATING UNINDEXED 
ATTRIBUTES, ASSUMING 
INDEPENDENCE. 
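
The independence assumption used in the two COMPUTE STATISTIC boxes can be illustrated with a minimal sketch. It assumes the statistic in question is a row-count estimate and that each criterion contributes a selectivity between 0 and 1, whether validated against an index or estimated for an unindexed attribute. Both function names are hypothetical.

```python
def combined_selectivity(selectivities):
    """Multiply per-criterion selectivities, treating the criteria as independent."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result


def estimated_rows(table_rows, selectivities):
    """Estimate how many rows satisfy all criteria under the independence assumption."""
    return table_rows * combined_selectivity(selectivities)
```

For example, `estimated_rows(1000, [0.1, 0.5])` gives roughly 50 rows. When the attributes are in fact correlated, this product can be far off, which is the case the subspace sets M(l) are meant to flag.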
VALIDATE
STATISTICS USING 
INDEX. 
ENTROPY DATA
AVAILABLE FOR
INDEXED ATTRIBUTE(S)?
USE PRIOR STATISTICS

FOR INDEXED 
ATTRIBUTE(S). 
No



Yes 



IS ENTROPY 
GREATER THAN 
THRESHOLD? 
Yes



REVALIDATE 
STATISTICS USING 
INDEX. 